Reference-Free Comparative Genomics of 174 Chloroplasts
Abstract
Direct analysis of unassembled genomic data could greatly increase the power of short-read DNA sequencing technologies and allow comparative genomics of organisms for which no completed reference is available. Here, we compare 174 chloroplasts by analyzing the taxonomic distribution of short kmers across genomes [1]. We then assemble de novo contigs centered on informative variation. The localized de novo contigs fall into two major classes: tip contigs, unique to a single genome, and group contigs, shared by a subset of genomes. Prior to assembly, we found that ~18% of the chloroplast was duplicated in the inverted repeat (IR) region across a four-fold difference in genome sizes, from a highly reduced parasitic orchid [2] to a massive algal chloroplast [3], including gnetophytes [4] and cycads [5]. The conservation of this ratio between single-copy and duplicated sequence was basal among green plants, independent of photosynthesis and of the mechanism of genome size change, and different in gymnosperms and lower plants. Major lineages in the angiosperm clade differed in the pattern of shared kmers and de novo contigs. For example, parasitic plants showed the expected accelerated overall rate of evolution, while the hemi-parasitic genomes contained considerably more novel sequence than the holo-parasitic genomes, suggesting that different mechanisms operate at different stages of genomic contraction. Additionally, the legumes are diverging more quickly, and in different ways, than other major families. Small duplicated fragments of the rrn23 genes were deeply conserved among seed plants, including among several species without the IR regions, indicating a crucial functional role of this duplication.
Localized de novo assembly of informative kmers greatly reduces the complexity of large comparative analyses by confining the analysis to the small partition of data and genomes relevant to a specific question. This allows direct analysis of next-generation sequence data from previously unstudied genomes and rapid discovery of informative candidate regions.
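The tip/group distinction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `kmers` and `classify_kmers`, the toy sequences, and the extra `core` label (a kmer present in every genome, and so uninformative for comparison) are assumptions introduced here for illustration. The sketch also ignores reverse complements and sequencing errors, which a real analysis of unassembled reads would need to handle.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq, skipping any with non-ACGT characters."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def classify_kmers(genomes, k):
    """Label each kmer by its taxonomic distribution across the input genomes.

    'tip'   = found in exactly one genome
    'group' = found in a proper subset of two or more genomes
    'core'  = found in every genome (hypothetical label, not from the source)
    """
    carriers = defaultdict(set)  # kmer -> names of the genomes that contain it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            carriers[kmer].add(name)

    total = len(genomes)
    classes = {}
    for kmer, names in carriers.items():
        if len(names) == 1:
            classes[kmer] = "tip"
        elif len(names) < total:
            classes[kmer] = "group"
        else:
            classes[kmer] = "core"
    return classes, dict(carriers)


# Toy sequences standing in for chloroplast genomes (made up for illustration).
toy_genomes = {"A": "ACGTAC", "B": "ACGTTT", "C": "ACGGGG"}
toy_classes, toy_carriers = classify_kmers(toy_genomes, k=3)
```

In this toy input, `ACG` occurs in all three sequences, `CGT` only in A and B, and `GTA` only in A, so the three kmers receive the labels `core`, `group`, and `tip` respectively. Only the tip and group kmers would be candidates for the localized assembly the abstract describes.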
Related articles
Cytonuclear coevolution: the genomics of cooperation.
Without mitochondria we would be in big trouble, and there would be a global biological energy crisis if it were not for chloroplasts. Fortunately, genomic evolution over the past two billion years has ensured that the functions of these key organelles are with us to stay. Whole-genome analyses have not only proven that mitochondria and chloroplasts are descended from formerly free-living bacte...
CyanoClust: Protein Cluster Database for Comparative genomics of Cyanobacteria and Plastids
Chloroplasts are the sites of photosynthesis within the cell of land plants and algae. In non-photosynthetic tissues of plants, they are called plastids. Various lines of evidence, such as similarity of photosynthetic machineries and photosynthetic metabolic pathways as well as fossil records, suggested that the cyanobacteria are related to the origin of chloroplasts [1]. Now, cyanobacteria exh...
ChloroMitoCU: Codon patterns across organelle genomes for functional genomics and evolutionary applications
Organelle genomes are widely thought to have arisen from reduction events involving cyanobacterial and archaeal genomes, in the case of chloroplasts, or α-proteobacterial genomes, in the case of mitochondria. Heterogeneity in base composition and codon preference has long been the subject of investigation of topics ranging from phylogenetic distortion to the design of overexpression cassettes f...
Database tool CyanoClust: comparative genome resources of cyanobacteria and plastids
Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitr...
CyanoClust: comparative genome resources of cyanobacteria and plastids
Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitr...